On the development of new cosine-based probabilistic methods with applications to univariate and bivariate analyses of the wind speed energy

A number of probability distributions have been successfully applied to wind speed and energy data sets in the literature. However, no published work models such data with probability distributions constructed from trigonometric functions, and there is likewise a lack of studies applying bivariate trigonometric-based probability distributions to wind speed and energy data. In this paper, we address these two research gaps. We first incorporate a cosine function to introduce a new univariate distributional method, namely, the univariate modified cosine-G (UMC-G) family. Using the UMC-G method, a new probability distribution called the univariate modified cosine-Weibull (UMC-Weibull) distribution is studied. We apply the UMC-Weibull distribution to the wind energy data set taken from the weather station at Sotavento Galicia, Spain. Furthermore, we introduce a bivariate version of the UMC-G method using the Farlie-Gumbel-Morgenstern copula approach. The proposed bivariate distributional method is called the bivariate modified cosine-G (BMC-G) family. A special member of the BMC-G distributions, called the bivariate modified cosine-Weibull (BMC-Weibull) distribution, is introduced. We apply the BMC-Weibull distribution to the bivariate data set of wind speed and energy taken from the weather station at Sotavento Galicia. Using different statistical tools, we observe that the UMC-Weibull and BMC-Weibull distributions are the best-suited models among the competing distributions for these data sets.


Introduction
Among the possible sources of energy, wind is one of the most useful; the energy it generates is termed wind energy. Wind energy offers many environmental and economic advantages over other sources such as fossil fuel-based energy, which badly pollutes the atmosphere. Because wind speed is the most important parameter in generating wind energy, an accurate selection of probability distributions for modeling and estimating the wind speed energy is very important [1][2][3][4].
Wind speed is an extremely valuable energy source. Consequently, recent years have seen considerable interest in developing suitable probability distributions for predicting and modeling the energy output of wind energy conversion systems [5]. In this regard, several new probability distributions have been developed and successfully applied to wind speed energy data sets. For example, Kantar et al. [6] introduced a modified version of the Lindley distribution called the extended generalized Lindley (EGL) distribution, whose cumulative distribution function (CDF) is defined on $\mathbb{R}^+$ with positive parameters.
They applied the EGL distribution to wind speed data taken from different locations in Turkey. Jia et al. [7] proposed another new probability distribution, the Topp-Leone Lindley (TLL) distribution, whose CDF is likewise defined on $\mathbb{R}^+$ with positive parameters.
They used the TLL distribution to analyze long-term measured wind speed data taken from different wind stations in China.
Ul-Haq et al. [8] introduced a new version of the power Lindley distribution using the alpha power transformation of Elbatal et al. [9]. They called the proposed model the new alpha power transformed power Lindley (NAPTPL) distribution. Its CDF is defined on $\mathbb{R}^+$ with positive parameters, one of which must differ from 1.
The NAPTPL distribution was used to model wind speed data taken from five locations in Pakistan: Haripur, Gwadar, Quetta, Bahawalpur, and Peshawar. For more studies on the development of probability distributions for modeling wind speed energy data sets, we refer to [10][11][12][13][14][15][16][17].
As briefly discussed above, a series of probability distributions have been introduced and applied to wind speed energy data sets. However, in our search of the literature, we did not find any published paper on wind speed energy data modeling using trigonometric-based probability distributions. Nor did we find any published paper on wind speed and energy data modeling using bivariate trigonometric-based probability distributions. In this paper, we address these two research gaps. First, we develop a univariate trigonometric-based distributional method for modeling the wind energy data. Second, we develop a bivariate trigonometric-based distributional method for analyzing the wind speed and energy data.
The rest of this paper is organized as follows. A univariate trigonometric-based distributional method is introduced in Section 2, where a special member of the method is also discussed. In Section 3, the proposed bivariate distributional method is provided, and its special member is discussed in detail. In Section 4, two practical applications involving wind speed and energy data sets are analyzed. Some concluding remarks are provided in Section 5.

A univariate modified cosine-𝑮 method and its special case
Recent developments in distribution theory that incorporate trigonometric functions have received considerable attention [18][19][20]. In this section, we present our first proposal by introducing a new distributional method, namely, the univariate modified cosine-G (UMC-G) family of distributions. As its name suggests, the proposed method is obtained by incorporating a cosine function. Using the first proposal, we consider a special case to illustrate the applications of the UMC-G distributions.

A univariate modified cosine-𝐺 method
Assume $X$ is a UMC-G distributed random variable defined on $\mathbb{R}$. The CDF of $X$, say $F(x)$, is given by Eq. (1); it involves the baseline survival function $\bar{G}(x) = 1 - G(x)$ and one additional parameter.
The probability density function (PDF) of the UMC-G models, say $f(x)$, is obtained by differentiating $F(x)$ with respect to $x$, where $\frac{d}{dx}G(x) = g(x)$ is the baseline PDF. The survival function (SF) of $X$ is $S(x) = 1 - F(x)$, whereas the hazard function (HF) and cumulative HF (CHF) of the UMC-G models are given, respectively, by $h(x) = f(x)/S(x)$ and $H(x) = -\log S(x)$. In the next subsection, we discuss a special case of the UMC-G distributional method, namely, the univariate modified cosine-Weibull (UMC-Weibull) distribution.

A special case of the UMC-𝐺 method
In this section, we define the basic functions of the special member of the UMC-G method. For this purpose, we take the Weibull distribution with shape and scale parameters as the baseline model.
Assume $X$ is a Weibull distributed random variable defined on $\mathbb{R}^+$, with CDF $G(x)$, given by Eq. (2), and PDF $g(x)$. Using Eq. (2) in Eq. (1), we obtain the CDF and SF of the UMC-Weibull distribution. For different values of its three parameters, the CDF and SF plots of the UMC-Weibull distribution are provided in Fig. 1. The PDF of the UMC-Weibull distribution follows by differentiating its CDF.

B. Alnssyan and M.A. Alomair
For various selected values of the three parameters, different PDF plots of the UMC-Weibull distribution are illustrated in Fig. 2. By taking the ratio of Eq. (3) and Eq. (4), we get the HF of the UMC-Weibull distribution. For different chosen values of the parameters, the HF plots of the UMC-Weibull distribution are visualized in Fig. 3. Furthermore, the cumulative hazard function (CHF) of the UMC-Weibull distribution is the negative logarithm of its SF, and the reverse hazard function (RHF) is the ratio of its PDF to its CDF.
For different selected values of the parameters, the plots of the CHF of the UMC-Weibull distribution are presented in Fig. 4.
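As a minimal sketch of how the SF, HF, CHF, and RHF follow from a CDF and PDF pair, the Python snippet below uses the plain Weibull baseline as a stand-in. The parameterization $G(x) = 1 - \exp\{-(x/\text{scale})^{\text{shape}}\}$ and all function names are assumptions made for illustration; the UMC-Weibull CDF and PDF themselves are not reproduced here and would be substituted for the two baseline functions.

```python
import math


def weibull_cdf(x, shape, scale):
    """Baseline Weibull CDF (assumed scale parameterization)."""
    return 1.0 - math.exp(-((x / scale) ** shape))


def weibull_pdf(x, shape, scale):
    """Baseline Weibull PDF, the derivative of weibull_cdf in x."""
    z = x / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))


def reliability_functions(x, cdf, pdf):
    """Derive SF, HF, CHF, and RHF at x from any CDF and PDF pair."""
    big_f = cdf(x)
    small_f = pdf(x)
    sf = 1.0 - big_f
    return {
        "sf": sf,                # survival function, 1 - F(x)
        "hf": small_f / sf,      # hazard function, f(x) / S(x)
        "chf": -math.log(sf),    # cumulative hazard, -log S(x)
        "rhf": small_f / big_f,  # reverse hazard, f(x) / F(x)
    }


# Illustrative parameter values only (hypothetical, not fitted estimates).
shape, scale = 1.5, 2.0
values = reliability_functions(
    1.0,
    lambda x: weibull_cdf(x, shape, scale),
    lambda x: weibull_pdf(x, shape, scale),
)
print(values)
```

Passing the CDF and PDF as callables keeps the derived quantities independent of the particular family, so the same helper applies to any member listed for the UMC-G method once its CDF and PDF are coded.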

By incorporating the UMC-G distributional method, a series of new modified distributions can be obtained. In this regard, some special members of the UMC-G distributional method are presented in Table 1.

Table 1
Some special members of the UMC-G distributional method. Columns: S. No., baseline model, and baseline SF $\bar{G}(x)$.

A bivariate modified cosine-𝑮 method and its special case
When two variables are dependent, bivariate distributions provide satisfactory results. Bivariate distributions have been successfully used to model real-life phenomena in reliability, hydrology, drought analysis, sports, and many other areas. Owing to these satisfactory results, researchers have shown an increased interest in developing new bivariate distributions; see, for example, [21][22][23][24][25].
In the statistical literature, the Farlie-Gumbel-Morgenstern (FGM) family has proven to be one of the most prominent approaches for generating bivariate distributions. The FGM distributions date back to the mid-20th century and are due to Morgenstern [26]. Later, a multivariate version of the FGM method was studied by Farlie [27]. Assume $X_1$ and $X_2$ are random variables with marginal CDFs $F(x_1)$ and $F(x_2)$. The bivariate CDF of the FGM family is

$$F(x_1, x_2) = F(x_1) F(x_2) \left[ 1 + \delta \{1 - F(x_1)\} \{1 - F(x_2)\} \right], \qquad (5)$$

where $x_1, x_2 \in \mathbb{R}$ and $\delta \in [-1, 1]$. When $\delta = 0$, Eq. (5) reduces to $F(x_1, x_2) = F(x_1) F(x_2)$, the case of independence. Corresponding to Eq. (5), the bivariate PDF $f(x_1, x_2)$ of the FGM family of distributions is given by

$$f(x_1, x_2) = f(x_1) f(x_2) \left[ 1 + \delta \{1 - 2F(x_1)\} \{1 - 2F(x_2)\} \right].$$

In the next section, we use the FGM approach to introduce the bivariate version of the UMC-G distributions, namely, the bivariate modified cosine-G (BMC-G) distributions.
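The FGM construction can be sketched in a few lines of Python. The snippet below is a hedged illustration of the standard FGM form, with a dependence parameter named `delta` restricted to [-1, 1]; the function names are hypothetical, and the Weibull marginals in the usage lines are stand-ins for the UMC-G marginals, which are not reproduced here.

```python
import math


def _check_delta(delta):
    # The FGM family is only a valid distribution for delta in [-1, 1].
    if not -1.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [-1, 1]")


def fgm_cdf(cdf1, cdf2, delta):
    """Joint CDF from the two marginal CDF values at (x1, x2)."""
    _check_delta(delta)
    return cdf1 * cdf2 * (1.0 + delta * (1.0 - cdf1) * (1.0 - cdf2))


def fgm_pdf(pdf1, pdf2, cdf1, cdf2, delta):
    """Joint PDF from the marginal PDF and CDF values at (x1, x2)."""
    _check_delta(delta)
    return pdf1 * pdf2 * (1.0 + delta * (1.0 - 2.0 * cdf1) * (1.0 - 2.0 * cdf2))


def weibull_cdf(x, shape, scale):
    # Stand-in marginal (assumed scale parameterization).
    return 1.0 - math.exp(-((x / scale) ** shape))


# Illustrative values only; the parameters are hypothetical, not estimates.
u = weibull_cdf(1.2, 1.5, 2.0)
v = weibull_cdf(0.8, 2.0, 1.0)
print(fgm_cdf(u, v, 0.4))   # joint CDF with positive dependence
print(fgm_cdf(u, v, 0.0))   # equals u * v under independence
```

Working with marginal values rather than marginal functions keeps the copula step separate from the choice of marginal family, which mirrors how the bivariate method is built from the univariate one.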

A bivariate modified cosine-𝐺 method
Assume $X_1$ follows the UMC-G models with CDF $F(x_1)$, given by Eq. (6), and PDF $f(x_1)$. Now, assume $X_2$ also follows the UMC-G models with CDF $F(x_2)$, given by Eq. (7), and PDF $f(x_2)$. Using Eq. (6) and Eq. (7) in Eq. (5), we get the CDF of the BMC-G distributions, Eq. (8).
The PDF of the BMC-G distributions follows by differentiating Eq. (8) with respect to $x_1$ and $x_2$; because the marginal CDFs are built from a cosine function, the resulting density contains sine terms. In the next subsection, we provide a special member of the BMC-G distributions, namely, the bivariate modified cosine-Weibull (BMC-Weibull) distribution.

A special case of the BMC-𝐺 method
In this section, we define a special member of the BMC-G method. For this purpose, we again consider the Weibull model as a baseline distribution.
Assume $X_1$ follows the Weibull distribution with its own shape and scale parameters, with CDF given by Eq. (9) and the corresponding PDF. Similarly, assume $X_2$ follows the Weibull distribution with its own shape and scale parameters, with CDF given by Eq. (10) and the corresponding PDF.
Using Eq. (9) and Eq. (10) in Eq. (8), we get the CDF of the BMC-Weibull distribution, Eq. (11), together with its SF. For different values of the seven model parameters, the plots of the CDF and SF of the BMC-Weibull distribution are presented in Figs. 5 and 6. The plots in Figs. 5 and 6 are consistent with a valid CDF, as the curves lie between zero and one. Corresponding to Eq. (11), the PDF of the BMC-Weibull distribution is obtained by differentiating with respect to $x_1$ and $x_2$.

Univariate and bivariate analyses of the wind speed energy data sets
This section has two aims and is divided into two subsections. In the first subsection, we apply the UMC-Weibull distribution to the wind energy data set. In the second subsection, we apply the BMC-Weibull distribution to the wind speed and energy data sets.

Univariate analysis of the wind energy data set
As discussed above, this subsection illustrates the UMC-Weibull distribution using the wind energy data set. For this purpose, we consider a practical application involving the wind energy recorded hourly at the weather station of Sotavento Galicia, Spain. The wind energy data set was recorded on July 19, 2023. This data set is also available at: http://www.sotaventogalicia.com/en/technical-area/real-time-data/historical/. The Sotavento Galicia weather station is located at 43.3544° N and 7.8812° W; see Fig. 13.

Some descriptive plots of the data set are presented in Fig. 14. These plots include the hourly (i) wind rose (top-left plot), (ii) produced energy (top-right plot), (iii) wind speed (m/s) (bottom-left plot), and (iv) degrees of the wind (bottom-right plot). Furthermore, the kernel density, histogram, box plot, and violin plot of the wind energy data are provided in Fig. 15.
Using the wind energy data set, we assess the applicability of the UMC-Weibull distribution by comparing its fit with that of some prominent probability distributions, including
• the flexible Weibull (F-Weibull) distribution of Bebbington et al. [28], and
• the new modified flexible Weibull (NMF-Weibull) distribution of Ahmad et al. [29].
The fitted distributions are compared using goodness-of-fit statistics, including
• the Anderson-Darling test statistic, calculated as $-n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1) \left[ \log F(x_i) + \log \{1 - F(x_{n-i+1})\} \right]$, and
• the Kolmogorov-Smirnov test statistic, obtained as $\sup_x \left| F_n(x) - F(x) \right|$.
In the above formulas, $x_i$ and $n$ represent, respectively, the $i$th ordered observation and the size of the data. The terms $F(x_i)$ and $F(x_{n-i+1})$ represent the CDF of the competing distribution evaluated at the $i$th and $(n - i + 1)$th ordered observations, respectively. The terms $F_n(x)$ and $F(x)$ represent the empirical CDF and the CDF of the competing distribution, respectively, and the supremum is taken over the set of differences between $F_n(x)$ and $F(x)$.
The numerical results of the aforesaid statistical tests are obtained using the R package AdequacyModel together with a numerical optimization algorithm.
For a given data set, the probability model with the smallest values of the above test statistics is the best-suited model. Using the wind energy data set, the maximum likelihood estimates (MLEs) $\hat{\tau}$, $\hat{\beta}$, $\hat{\gamma}$, $\hat{\sigma}$, $\hat{\eta}$, and $\hat{\lambda}$ of the competing distributions are reported in Table 2. Furthermore, the comparative results (values of the test statistics) of the fitted distributions are reported in Table 3. According to the values of the test statistics in Table 3, the best-suited model for the wind energy data set is the UMC-Weibull distribution. Finally, Fig. 16 shows the fitted PDF, CDF, SF, and QQ plots of the UMC-Weibull distribution. These fitted plots suggest that the UMC-Weibull distribution fits the wind energy data set closely.
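The goodness-of-fit comparison used in this subsection can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it computes the uncorrected Cramér-von Mises, Anderson-Darling, and Kolmogorov-Smirnov statistics for a fully specified CDF, the inclusion of the Cramér-von Mises statistic is an assumption, and the function name `gof_statistics` is hypothetical. Software such as AdequacyModel may report small-sample corrected versions instead.

```python
import math


def gof_statistics(data, cdf):
    """Return (Cramer-von Mises, Anderson-Darling, Kolmogorov-Smirnov).

    `cdf` is the fitted CDF of a competing model; it must return values
    strictly between 0 and 1 at every observation, otherwise the
    logarithms in the Anderson-Darling sum are undefined.
    """
    x = sorted(data)                      # order statistics
    n = len(x)
    u = [cdf(v) for v in x]               # fitted CDF at each ordered point

    # Cramer-von Mises: 1/(12n) + sum_i [(2i - 1)/(2n) - F(x_i)]^2
    cm = 1.0 / (12 * n) + sum(
        ((2 * i + 1) / (2 * n) - u[i]) ** 2 for i in range(n)
    )

    # Anderson-Darling: -n - (1/n) sum_i (2i - 1)[log F(x_i) + log(1 - F(x_{n-i+1}))]
    ad = -n - sum(
        (2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
        for i in range(n)
    ) / n

    # Kolmogorov-Smirnov: largest gap between empirical and fitted CDF,
    # checked just before and at each jump of the empirical CDF.
    ks = max(max((i + 1) / n - u[i], u[i] - i / n) for i in range(n))
    return cm, ad, ks


# Toy check with a uniform CDF on (0, 1); the data are illustrative only.
print(gof_statistics([0.125, 0.375, 0.625, 0.875], lambda v: v))
```

Smaller values of all three statistics indicate a closer agreement between the fitted CDF and the empirical CDF, which is the ranking rule applied to the competing distributions.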

Bivariate analysis of the wind speed and energy data sets
In this section, we apply the BMC-Weibull distribution to the bivariate wind speed and energy data sets. The wind speed and energy data sets are available at: http://www.sotaventogalicia.com/en/technical-area/real-time-data/historical/. We represent the energy and wind speed data sets by $X_1$ and $X_2$, respectively. Some descriptive measures of the wind speed and energy data sets are presented in Table 4.

Testing of normality of the wind speed and energy data sets
Assessing the normality of data is a prerequisite for many statistical procedures, because normality is a basic assumption of parametric tests. In this section, we check the wind speed and energy data sets using two well-known statistical tests, the Shapiro-Wilk (SW) normality test and the Anderson-Darling (AD) normality test.

• The SW normality test
In this section, we apply the SW normality test with the hypotheses $H_0$: the data are normally distributed vs. $H_1$: the data are not normally distributed.
The SW statistics of $X_1$ and $X_2$ are 0.93711 and 0.93957, respectively, with corresponding p-values of 0.1406 and 0.1596. Since the p-values for both $X_1$ and $X_2$ are greater than 0.05, we fail to reject $H_0$ and treat the wind speed and energy data sets as consistent with normality.

• The AD normality test
Here, we perform the second normality check of the wind speed and energy data sets using the AD normality test, with the hypotheses $H_0$: the data are normally distributed vs. $H_1$: the data are not normally distributed. The AD statistics of $X_1$ and $X_2$ are 0.52547 and 0.47317, respectively, with corresponding p-values of 0.1623 and 0.2211. Since the p-values for both $X_1$ and $X_2$ are greater than 0.05, we fail to reject $H_0$ and treat the wind speed and energy data sets as consistent with normality.
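The link between an AD statistic and its p-value can be sketched in Python as below. The snippet is an illustration under assumptions: it uses the small-sample adjustment and piecewise approximation of D'Agostino and Stephens that is common in statistical software, it assumes a sample size of n = 24 (hourly records over one day), and the function names are hypothetical. The source does not state which software or sample size was used; under these assumptions the two reported p-values are recovered to four decimals.

```python
import math


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ad_normality_statistic(data):
    """AD statistic for normality with mean and SD estimated from the data."""
    x = sorted(data)
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    u = [normal_cdf((v - mean) / sd) for v in x]
    total = sum(
        (2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n


def ad_p_value(a2, n):
    """Approximate p-value for the AD normality statistic a2 with sample size n."""
    aa = a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)   # small-sample adjustment
    if aa < 0.2:
        return 1.0 - math.exp(-13.436 + 101.14 * aa - 223.73 * aa ** 2)
    if aa < 0.34:
        return 1.0 - math.exp(-8.318 + 42.796 * aa - 59.938 * aa ** 2)
    if aa < 0.6:
        return math.exp(0.9177 - 4.279 * aa - 1.38 * aa ** 2)
    if aa < 10.0:
        return math.exp(1.2937 - 5.709 * aa + 0.0186 * aa ** 2)
    return 3.7e-24


# Reported AD statistics, with the assumed sample size n = 24.
print(round(ad_p_value(0.52547, 24), 4))
print(round(ad_p_value(0.47317, 24), 4))
```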

Modelling the wind speed and energy data sets
The BMC-Weibull distribution is developed primarily for use in applied sectors. In this section, we illustrate this by applying the BMC-Weibull distribution to the wind speed and energy data sets. The BMC-Weibull distribution is compared with the
• Farlie-Gumbel-Morgenstern bivariate Weibull (FGMB-Weibull) distribution of Almetwally et al. [32], and
• Farlie-Gumbel-Morgenstern new heavy-tailed Weibull (FGMNHT-Weibull) distribution of Almaspoor and Tahmasebi [33].
Among the above distributions, the best-suited distribution for the wind speed and energy data sets is selected using four information criteria:
• Akaike information criterion (AIC), $2k - 2\ell(\cdot)$,
• Bayesian information criterion (BIC), $k \log(n) - 2\ell(\cdot)$,
• Consistent Akaike information criterion (CAIC), $\frac{2kn}{n - k - 1} - 2\ell(\cdot)$, and
• Hannan-Quinn information criterion (HQIC), $2k \log\{\log(n)\} - 2\ell(\cdot)$.
In the above formulas, $\ell(\cdot)$ denotes the log-likelihood function (LLF) evaluated at the MLEs, $k$ denotes the number of parameters, and $n$ represents the size of the sample. The probability distribution with the largest values of the information criteria performs worst, while the distribution with the smallest values is the best-suited distribution for the underlying data set. The MLEs and the values of the above information criteria for the fitted distributions are obtained using an R script.
For the wind speed and energy data sets, the values of $\hat{\tau}_1$, $\hat{\tau}_2$, $\hat{\beta}_1$, $\hat{\beta}_2$, $\hat{\gamma}_1$, $\hat{\gamma}_2$, $\hat{\theta}_1$, $\hat{\theta}_2$, and $\hat{\delta}$ of the fitted distributions are provided in Table 5. The values of the considered information criteria of the fitted distributions are provided in Table 6. From the results in Table 6, we can see that the proposed BMC-Weibull distribution has the lowest values of the information criteria. For the BMC-Weibull distribution, the values of the information criteria are AIC = 287.7830, BIC = 262.0300, CAIC = 260.7830, and HQIC = 255.9710. These results indicate that the BMC-Weibull distribution is the best-suited model among the competitors for the wind speed and energy data sets. The second best-suited model is the FGMNHT-Weibull distribution, with AIC = 294.9752, BIC = 267.0131, CAIC = 271.8447, and HQIC = 260.2473.
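As a hedged illustration of how the four information criteria are computed from a maximized log-likelihood, consider the Python sketch below. The function name and the inputs are assumptions: k = 7 parameters and n = 24 observations are inferred from the text, the log-likelihood value is back-solved for illustration rather than reported, and the CAIC is taken in the form $2kn/(n - k - 1) - 2\ell$. With these inputs the sketch agrees with the BIC, CAIC, and HQIC quoted above for the BMC-Weibull distribution to about three decimals; the quoted AIC is not reproduced by the same inputs, so it is not used as a check.

```python
import math


def information_criteria(loglik, k, n):
    """AIC, BIC, CAIC, and HQIC from a maximized log-likelihood.

    loglik: log-likelihood at the MLEs, k: number of parameters,
    n: sample size (n - k - 1 must be positive for the CAIC form used here).
    """
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 for this CAIC form")
    minus_two_ll = -2.0 * loglik
    return {
        "AIC": 2.0 * k + minus_two_ll,
        "BIC": k * math.log(n) + minus_two_ll,
        "CAIC": 2.0 * k * n / (n - k - 1) + minus_two_ll,
        "HQIC": 2.0 * k * math.log(math.log(n)) + minus_two_ll,
    }


# Assumed inputs (not reported in the text): k = 7, n = 24,
# and an illustrative log-likelihood of -119.8915.
for name, value in information_criteria(-119.8915, 7, 24).items():
    print(f"{name} = {value:.4f}")
```

Because all four criteria share the same term $-2\ell$ and differ only in the penalty on k, models fitted to the same data can be ranked by any of them, with smaller values preferred.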

Concluding remarks
In this paper, two new statistical methods were studied for generating new probability distributions. The first method, called the UMC-G family of distributions, was introduced for generating new univariate probability distributions. Using the proposed UMC-G method, a new version of the Weibull model called the UMC-Weibull distribution was studied. The effectiveness and application of the UMC-Weibull distribution were shown by analyzing the wind energy data set. The second method was called the BMC-G family of distributions. Based on the BMC-G method, a bivariate version of the UMC-Weibull model, called the BMC-Weibull distribution, was introduced. The BMC-Weibull distribution was applied to the wind speed and energy data sets and compared with other bivariate distributions. Based on the considered information criteria, we observed that the BMC-Weibull distribution is the best-suited model among the competing distributions for the wind speed and energy data sets.
In the future, we are motivated to introduce neutrosophic versions of the UMC-Weibull and BMC-Weibull distributions. Different estimation methods may be used to estimate the parameters of the UMC-Weibull and BMC-Weibull distributions. Furthermore, other trigonometric functions could be considered to introduce new probability distributions for modeling the wind speed and energy data sets.

Fig. 1. Graphical illustrations of the (a) CDF and (b) SF of the UMC-Weibull distribution.

Fig. 4. Graphical illustrations of the (a) CHF with red color, (b) CHF with green color, and (c) CHF with black color of the UMC-Weibull distribution.

Fig. 13. The location of the wind energy station at Sotavento Galicia, Spain.

Fig. 14. The visual illustration of the (a) wind rose, (b) energy, (c) wind speed, and (d) degrees of the wind.

Fig. 15. The plots of the (a) kernel density, (b) histogram, (c) box plot, and (d) violin plot of the wind energy data.

Fig. 16. The plots of the (a) fitted PDF, (b) fitted CDF, (c) fitted SF, and (d) QQ function of the UMC-Weibull distribution using the wind energy data.

Table 2
The MLEs of the fitted distributions using the wind energy data recorded on July 19, 2023.

Table 3
The values of the statistical tests of the fitted distributions using the wind energy data recorded on July 19, 2023.

Table 4
Descriptive measures of the wind speed and energy data sets.

Table 5
The MLEs of the fitted distributions using the wind speed and energy data sets recorded on July 19, 2023.

Table 6
The values of the information criteria of the fitted distributions using the wind speed and energy data sets recorded on July 19, 2023.